This vignette gives an overview of the
USTerritoryMapping R package, which seeks to make creating
categorical choropleth maps of the US that include the US territories a
little bit easier!
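First load the package. The data preparation code below also uses dplyr functions (%>%, left_join(), mutate(), and case_when()), so dplyr needs to be loaded as well.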
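To use this package, you will need a data frame with two columns:
1. a fill variable, coded as a factor
2. a jurisdiction identifier to join on, such as the two-letter US Postal Service code
fipscodes.rda is provided to facilitate #2.
For this vignette, we’ll be using the two provided datasets
census.uninsured19 and cdc.cvd.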
census.uninsured19 provides an example of a dataset with
complete data for all 50 states, D.C., and the 5 US territories. It is
already in the proper format for the provided package functions.
cdc.cvd is missing values for territories and requires
additional processing which we will demonstrate below.
We can see that census.uninsured19 has these two
components:
1. “Percent.Cat”: the Percentage Ages 19 or Under with No Health Insurance, categorized as a factor
2. “STUSPS”: the two-letter US Postal Service code
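As a rough illustration of that two-column shape, the sketch below builds a small made-up data frame and checks it. The helper check_map_data() is hypothetical and not part of the package, and the two-character check on the ID column reflects the USPS codes used in this example rather than a documented requirement.

```r
# Toy data in the same shape as census.uninsured19 (values are made up)
toy <- data.frame(
  STUSPS = c("OR", "WI", "VI"),
  Percent.Cat = factor(
    c("Less than 5%", "5% to <10%", "10% or Greater"),
    levels = c("Less than 5%", "5% to <10%", "10% or Greater")
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a package function): stops with an error
# if the columns are missing, the fill variable is not a factor,
# or an ID is not a two-character code
check_map_data <- function(df, join_var, fill_var) {
  stopifnot(
    is.data.frame(df),
    all(c(join_var, fill_var) %in% names(df)),
    is.factor(df[[fill_var]]),
    all(nchar(as.character(df[[join_var]])) == 2)
  )
  invisible(TRUE)
}

check_map_data(toy, join_var = "STUSPS", fill_var = "Percent.Cat")
```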
In cdc.cvd we are missing the territories and our fill
variable (Data_Value) has not yet been prepared as a factor.
We’ll first join the dataset to the provided fips_codes_state
dataset to get the full list of jurisdictions. Then we’ll code a new
factor variable for mapping, with missing values labeled “Data Not Available”.
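The join-then-categorize pattern can be seen in isolation on toy data. The sketch below uses base R (merge() and ifelse()) instead of dplyr, and the jurisdictions, values, and the cutoff of 200 are all made up for illustration.

```r
# Full list of jurisdictions (toy stand-in for the FIPS lookup table)
all_ids <- data.frame(state = c("OR", "WI", "GU"), stringsAsFactors = FALSE)

# Data that is missing one jurisdiction (GU)
rates <- data.frame(
  LocationAbbr = c("OR", "WI"),
  Data_Value = c(180, 230),
  stringsAsFactors = FALSE
)

# Left join: keep every row of all_ids; unmatched rows get NA
joined <- merge(all_ids, rates,
                by.x = "state", by.y = "LocationAbbr", all.x = TRUE)

# Categorize, then give missing values their own explicit level
labels <- c("Low (< 200)", "High (>= 200)")
cat_chr <- ifelse(joined$Data_Value < 200, labels[1], labels[2])
cat_chr[is.na(joined$Data_Value)] <- "Data Not Available"
joined$data.cat <- factor(cat_chr, levels = c(labels, "Data Not Available"))

table(joined$data.cat)
```

The vignette's own version of this step, using dplyr on the cdc.cvd data, follows.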
library(dplyr)  # provides %>%, left_join(), mutate(), and case_when()

data("fips_codes_state")

# Keep every jurisdiction in fips_codes_state; unmatched rows get NA for Data_Value
cdc.cvd <- fips_codes_state %>%
  left_join(cdc.cvd, by = c("state" = "LocationAbbr")) %>%
  mutate(data.cat = factor(
    case_when(
      Data_Value < 198 ~ "Q1 (166 to < 198)",
      Data_Value >= 198 & Data_Value < 215 ~ "Q2 (198 to < 215)",
      Data_Value >= 215 & Data_Value < 248 ~ "Q3 (215 to < 248)",
      Data_Value >= 248 & Data_Value < 400 ~ "Q4 (248 to 326)",
      is.na(Data_Value) ~ "Data Not Available"
    ),
    levels = c("Q1 (166 to < 198)", "Q2 (198 to < 215)",
               "Q3 (215 to < 248)", "Q4 (248 to 326)", "Data Not Available")
  ))
table(cdc.cvd$data.cat)
#>
#> Q1 (166 to < 198) Q2 (198 to < 215) Q3 (215 to < 248) Q4 (248 to 326)
#> 13 13 12 13
#> Data Not Available
#> 6
Start by defining the fill category colors with their factor labels.
colors.census <- c("Less than 5%" = "#feebe2",
                   "5% to <10%" = "#f768a1",
                   "10% or Greater" = "#7a0177")
Then specify any required parameters of the function (see documentation for details).
map1_categorical(data = census.uninsured19,
                 join_var = "STUSPS",
                 fill_var = "Percent.Cat",
                 fill_color = colors.census,
                 legend_name = "Percent Uninsured",
                 territory_label_color = "black",
                 title = "Figure 1. Percent Uninsured, Ages <19 Years",
                 save.filepath = "saved-maps/map1-uninsure.png")
Let’s say we wanted to add a border to highlight specific states or territories. We’ll first define a vector of US Postal Service codes (in this example, Oregon, Wisconsin, Virginia, and USVI) and then feed this into the border_ids parameter.
border <- c("OR", "WI", "VA", "VI")
map1_categorical(data = census.uninsured19,
                 join_var = "STUSPS",
                 fill_var = "Percent.Cat",
                 fill_color = colors.census,
                 legend_name = "Percent Uninsured",
                 title = "Figure 1. Percent Uninsured, Ages <19 Years",
                 border_ids = border,
                 border_color = "red",
                 border_linewidth = 1,
                 save.filepath = "saved-maps/map1-uninsure2.png")
Sometimes we may want to remove the inset box outline, which we can
do by specifying inset_box_color = "white".
We also highlight an additional option of removing the territory
labels by specifying territory_label_color = "white".
colors.cdc <- c("Q1 (166 to < 198)" = "#ffffcc",
                "Q2 (198 to < 215)" = "#a1dab4",
                "Q3 (215 to < 248)" = "#41b6c4",
                "Q4 (248 to 326)" = "#225ea8",
                "Data Not Available" = "grey80")
map1_categorical(data = cdc.cvd,
                 join_var = "state",
                 fill_var = "data.cat",
                 fill_color = colors.cdc,
                 fill_linewidth = 1.2,
                 fill_linecolor = "black",
                 inset_box_color = "white",
                 territory_label_color = "white",
                 legend_name = "CVD Mortality Rate\nper 100,000 persons",
                 border_ids = border,
                 border_color = "red",
                 border_linewidth = 1.5,
                 save.filepath = "saved-maps/map1-cvd.png")
We love maps of the territory geometries, but you might also want a map with the territory labels.
colors.census <- c("Less than 5%" = "#feebe2",
                   "5% to <10%" = "#f768a1",
                   "10% or Greater" = "#7a0177")
border <- c("OR", "WI", "VA", "VI")
map2_categorical(data = census.uninsured19,
                 join_var = "STUSPS",
                 fill_var = "Percent.Cat",
                 fill_color = colors.census,
                 legend_name = "Percent Uninsured",
                 title = "Figure 1. Percent Uninsured, Ages <19 Years",
                 border_ids = border,
                 border_color = "red",
                 border_linewidth = 1,
                 save.filepath = "saved-maps/map2-uninsure.png")
Note that in the current package version, territory labels cannot be highlighted with a border, even when specified in the border ID vector.
colors.cdc <- c("Q1 (166 to < 198)" = "#ffffcc",
                "Q2 (198 to < 215)" = "#a1dab4",
                "Q3 (215 to < 248)" = "#41b6c4",
                "Q4 (248 to 326)" = "#225ea8",
                "Data Not Available" = "grey80")
map2_categorical(data = cdc.cvd,
                 join_var = "state",
                 fill_var = "data.cat",
                 fill_color = colors.cdc,
                 fill_linewidth = 1.2,
                 fill_linecolor = "black",
                 inset_box_color = "white",
                 legend_name = "CVD Mortality Rate\nper 100,000 persons",
                 border_ids = border,
                 border_color = "red",
                 border_linewidth = 1.5,
                 save.filepath = "saved-maps/map2-cvd.png")
To map at the county level, you will need a data frame with two columns:
1. a fill variable, coded as a factor
2. a county identifier to join on (the five-digit county FIPS code)
The tidycensus package has a useful list that
can be loaded and joined to the data frame as needed to facilitate
#2.
fips_county <- tidycensus::fips_codes
head(fips_county)
#> state state_code state_name county_code county
#> 1 AL 01 Alabama 001 Autauga County
#> 2 AL 01 Alabama 003 Baldwin County
#> 3 AL 01 Alabama 005 Barbour County
#> 4 AL 01 Alabama 007 Bibb County
#> 5 AL 01 Alabama 009 Blount County
#> 6 AL 01 Alabama 011 Bullock County
For this vignette, we’ll be using one of the provided datasets,
census.uninsured19.co, which provides an example of a dataset
with complete data for all US state and
territory counties and county-equivalent units. It is already in the
proper format for the provided package functions.
County-level data preparation note: If your dataset does not have
complete data for all US state and territory counties, see the data
preparation steps in the state- and territory-level
example above (cdc.cvd data). The same steps
apply, but use the fips_county data for the join to
obtain all county-level geometry IDs.
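A minimal sketch of that county-level join, using base R and a three-row stand-in for fips_county: the column names state_code and county_code match the tidycensus output shown above, while the my_data values and the category labels are made up for illustration.

```r
# Three-row stand-in for tidycensus::fips_codes (two Alabama counties, one Puerto Rico municipio)
fips_toy <- data.frame(
  state_code = c("01", "01", "72"),
  county_code = c("001", "003", "001"),
  stringsAsFactors = FALSE
)

# Five-digit county ID = two-digit state code + three-digit county code
fips_toy$GEOID <- paste0(fips_toy$state_code, fips_toy$county_code)

# Incomplete data: only one county has a value (made-up number)
my_data <- data.frame(GEOID = "01001", value = 4.2, stringsAsFactors = FALSE)

# Left join so every county ID is kept, then label the gaps explicitly
full <- merge(fips_toy["GEOID"], my_data, by = "GEOID", all.x = TRUE)
full$value.cat <- factor(
  ifelse(is.na(full$value), "Data Not Available", "Has Data"),
  levels = c("Has Data", "Data Not Available")
)
full
```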
We can see that census.uninsured19.co has these two
components:
1. “Percent.Cat”: the Percentage Ages 19 or Under with No Health Insurance, categorized as a factor
2. “GEOID”: the five-digit county FIPS code, coded as a factor
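County FIPS codes often lose their leading zero when a file is read in as numeric (for example, 01001 becomes 1001). The sketch below shows one way to restore the five-digit form before converting to a factor. It assumes the join needs the zero-padded form to match, which the GEOID description above suggests but does not state outright.

```r
# County codes as they might arrive after a numeric import (leading zeros lost)
geoid_num <- c(1001, 6037, 72001)

# Pad back to five digits, then code as a factor like census.uninsured19.co
geoid_chr <- sprintf("%05d", as.integer(geoid_num))
geoid_fct <- factor(geoid_chr)

geoid_chr
```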
Start by defining the fill category colors with their factor labels.
colors.census <- c("Less than 5%" = "#feebe2",
                   "5% to <10%" = "#f768a1",
                   "10% or Greater" = "#7a0177")
Then specify any required parameters of the function (see documentation for details).
Note that the default for the county geometry year
(county_data_year) is "2020", which uses the 2020 county
geometry. Set county_data_year = "2010" to
map with the 2010 county geometry file instead.
map1_categorical_county(data = census.uninsured19.co,
                        join_var = "GEOID",
                        county_data_year = "2020",
                        fill_var = "Percent.Cat",
                        fill_color = colors.census,
                        fill_linewidth = 0.5,
                        fill_linecolor = "gray50",
                        legend_name = "Percent Uninsured",
                        title = "Figure 1. Percent Uninsured, Ages <19 Years",
                        state_color = "black",
                        state_linewidth = 1,
                        save.filepath = "saved-maps/map1-uninsure-co.png")
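Two quick checks may be worth running before any of the map calls in this vignette: that every factor level has a named color, and that the output folder exists. The sketch below is only an illustration. Whether the package functions already handle missing colors or create the folder themselves is not covered here, so both checks are assumptions, and a temporary directory stands in for saved-maps.

```r
colors.census <- c("Less than 5%" = "#feebe2",
                   "5% to <10%" = "#f768a1",
                   "10% or Greater" = "#7a0177")

# Toy fill variable with the same levels as the color names
fill <- factor(c("Less than 5%", "10% or Greater"),
               levels = names(colors.census))

# Any factor level without a named color would show up here
missing_colors <- setdiff(levels(fill), names(colors.census))
stopifnot(length(missing_colors) == 0)

# Create the output folder if needed (tempdir() used so the sketch
# does not write into the working directory)
out_dir <- file.path(tempdir(), "saved-maps")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
```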